Feature/auditable behavior #12
base: main
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@             Coverage Diff              @@
##               main      #12      +/-   ##
============================================
+ Coverage     68.56%   69.58%    +1.01%
- Complexity     8065     8305      +240
============================================
  Files           232      259       +27
  Lines         24550    25467      +917
============================================
+ Hits          16832    17720      +888
- Misses         7718     7747      +29

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Hi Moritz, following up on your last comment in our discussion (propelorm#1983 (comment)) on the main repo - one way to check for manual DB edits could be triggers. Triggers on the audit table (BEFORE and maybe also AFTER) instantiate a communication with Propel, so Propel knows if there has been any manual CRUD on the audit table.

As Propel should basically only ever do INSERT operations on the audit table (new row), if I understand you right, Propel could hand over a UUID or hash (each maybe salted with environment-specific data) that is valid only for the duration of the specific operation Propel performs on the audit table. This value would be saved in another field of the audit table (so you have two fields on every inserted row in the audit table), and the UUID should maybe also be updated on each operation on the original table. The UUID can basically function as a lock value. On any operation on the DB, the triggers give the value back to Propel. An additional value to check would be the connection id, which should also be unique.

Communication with Propel could be done, for example, via a User Defined Function that calls Propel, although UDFs are a bit hacky and don't seem to be evenly supported across different DB systems. Another option to call Propel might be a service broker. If it is not feasible at all to let the DB call Propel by itself, there would still be the option to check the UUID of the row(s) of the audit table every time Propel does a DB operation. In that case the last UUIDs (and connection ids?) would probably have to be stored somewhere (encrypted?) by Propel as a reference. One could also compare with the log files of the DB (if activated). In this way you could, in theory, alert Propel to external changes in the DB.

Additionally, one could try to never release the write lock on the audit table. Yet this is prone to fail with connection timeouts, as the lock will be released after a certain timeout, I believe.

All in all, none of the ideas above seems to be 100% reliable/feasible on its own, yet a combination of them could provide a quite good safety measure for dealing with external DB edits. As mentioned in propelorm#1983 (comment), there would also still be the possibility to move the audit table completely to a dedicated DB which uses the archive engine to prevent UPDATE and DELETE operations.

I hope these ideas can be a way to make your auditable behavior more reliable. All the best,
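A rough sketch of the UUID/token idea in MySQL syntax (all table and column names here are made up for illustration, not taken from the PR): Propel would set a session variable right before writing, a trigger copies it into the audit row, and rows written without it stand out later.

```sql
-- Hypothetical audit table "book_audit" gets an extra token column.
ALTER TABLE book_audit ADD COLUMN operation_token CHAR(36) DEFAULT NULL;

DELIMITER //

-- Propel would run e.g. `SET @propel_audit_token = UUID();` on its own
-- connection right before the INSERT; manual inserts leave the token NULL.
CREATE TRIGGER book_audit_set_token
BEFORE INSERT ON book_audit
FOR EACH ROW
BEGIN
  SET NEW.operation_token = @propel_audit_token;
END//

-- Manual UPDATE/DELETE on the audit table can simply be rejected.
CREATE TRIGGER book_audit_block_update
BEFORE UPDATE ON book_audit
FOR EACH ROW
BEGIN
  SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'audit rows are append-only';
END//

DELIMITER ;
```

Propel could then compare the stored tokens against the ones it handed out (or simply look for NULL tokens) to spot rows it did not write itself.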
Addendum to my last option above: While doing a somewhat unrelated Google query for „propel orm leadership", I stumbled upon an elegant solution Propel supports for working with several DBs at once. I wonder how Google sometimes creates its results - maybe because I wildly searched for different Propel-related stuff today. 😄

Anyway: Propel allows you to directly access different DBs within one schema.xml, even with FKs across the different DBs, which are then directly callable as related objects. One just has to declare a schema parameter for the core DB within the schema.xml and then a foreignSchema parameter for every foreign key that lies within a different DB, e.g. a DB with the archive engine and the audit table. It is also possible to define the foreign DB directly within the same schema.xml. An example (in German) is given here: https://ansas-meyer.de/programmierung/mysql/propel-fremdschluessel-mehrere-datenbanken/

I hope that adds another puzzle piece towards sketching a solution. Best,
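A minimal sketch of what such a schema.xml could look like, based on the schema/foreignSchema attributes described above (database, table and column names are invented for illustration):

```xml
<database name="bookstore" defaultIdMethod="native" schema="app_db">
  <table name="book">
    <column name="id" type="integer" primaryKey="true" autoIncrement="true"/>
    <column name="title" type="varchar" size="255"/>
  </table>

  <!-- audit table placed in a separate database/schema,
       e.g. one whose tables use the ARCHIVE engine -->
  <table name="book_audit" schema="audit_db">
    <column name="audit_id" type="integer" primaryKey="true" autoIncrement="true"/>
    <column name="book_id" type="integer"/>
    <foreign-key foreignTable="book" foreignSchema="app_db">
      <reference local="book_id" foreign="id"/>
    </foreign-key>
  </table>
</database>
```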
Creates a table storing audit information for the table holding the behavior.
CRUD operations on the audited table will automatically create entries in an automatically generated audit table, describing the change.
Parameters
The behavior accepts most of the parameters from synced table behavior (see #10). Additionally, auditable behavior uses these parameters:
audit_id
audited_at
audit_event
changed_values
JSON
null
null
null
BLOB, CLOB
changed
null
true
|false
false
true
|false
false
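For orientation, attaching the behavior in schema.xml would presumably look something like the sketch below; the behavior name is assumed from the PR title and the table is invented, so treat this as illustration rather than the actual API:

```xml
<table name="book">
  <column name="id" type="integer" primaryKey="true" autoIncrement="true"/>
  <column name="fruit" type="varchar" size="255"/>
  <!-- behavior name assumed; parameters from the list above could be
       overridden via <parameter name="..." value="..."/> elements -->
  <behavior name="auditable"/>
</table>
```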
This implementation is somewhat specific, in that it tries to avoid duplicating the whole row data on input. Subsequent updates store the overridden values, which makes it possible to rebuild the row history backwards from the current row data. This approach has benefits, but it comes at a cost, particularly its fragility to manual row updates without audits.
The problem is that manual updates which don't add audits are not just unaccounted for, they lead to wrong audits. Currently, the stored data of an audit would look something like this:
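Illustrated with a hypothetical fruit column and made-up values (the exact storage format in the PR may differ):

```
audit_id | audit_event | changed_values
       1 | insert      | null
       2 | update      | {"fruit": "apple"}
```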
The recorded values are the ones being overridden by the update, which is why no values need to be recorded on insert. If the current value of the fruit column is "orange", you would get this audit:
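Rebuilding backwards from the current value "orange" and the stored overridden value "apple" (same hypothetical values as above):

```
insert: fruit = "apple"
update: fruit = "apple" -> "orange"
```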
However, with a manual update in between, the actual events might have been:
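Again with the same hypothetical values:

```
insert: fruit = "apple"               (through Propel, audited)
update: fruit = "apple" -> "banana"   (through Propel, audited)
update: fruit = "banana" -> "orange"  (manual SQL, no audit)
```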
So instead of showing an unaccounted change, the audit just claims that the user entered data which they never entered. This is very dicey, if acceptable at all.
Having to rebuild the audit after reading it from the DB adds another layer of possible mistakes and errors.
The clean approach would be to store the new values, instead of the overridden ones:
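For the same hypothetical events, that would give:

```
audit_id | audit_event | changed_values
       1 | insert      | {"fruit": "apple"}
       2 | update      | {"fruit": "banana"}
```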
One could possibly even store the old values as well, if you want to have at least some idea about manual updates:
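Same events, storing both old and new values (the format is illustrative only):

```
audit_id | audit_event | changed_values
       1 | insert      | {"fruit": {"old": null, "new": "apple"}}
       2 | update      | {"fruit": {"old": "apple", "new": "banana"}}
```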
With this, the reported user input is always correct, even if the values do not match the row values due to manual updates.
However, this requires duplicating all values on input, even when you know that most values will never change.
So if you know that there are no manual updates (or those updates also add audits), the first approach is more efficient, as it allows for faster inserts and requires less space.
Looking back, going the easy route would have spared me time and nerves, but I really didn't want to make full duplicates (I am dealing with rather large text columns that typically won't change after insert).
But with all the caveats and sensitivities, I am not sure if this works in the general case.